ticker <- "MSFT" Financial/Economic Data using TidyQuant
What is R?
Let’s briefly discuss R, RStudio and Quarto, and how they work together.
We’ll also explain what R “packages” are and how they add extra superpowers to R that make your life easier.
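Throughout this walkthrough we'll lean on a handful of packages; a minimal setup chunk looks like this (assuming the packages have been installed once already):

```r
# one-time installation (uncomment and run once):
# install.packages(c("tidyverse", "tidyquant", "writexl", "googlesheets4", "janitor"))

# load the packages for this session
library(tidyverse)   # data wrangling and ggplot2 charts
library(tidyquant)   # stock prices, FRED data, and financial functions
```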
Stock Data
To gather and analyze financial data, we’ll explore using the tidyquant package - which allows us to directly download data on measures like stock prices as well as economic indicators from the Fed. By default stock data is sourced from Yahoo Finance, though the package also provides connectors to other sources.
Analyzing a company
Let’s see how we grab stock price data directly from the web and work with it. First we’ll assign the stock ticker/symbol for the company we’re interested in exploring.
ticker <- "MSFT"
Then we use the tq_get() function to download the stock data.
This function returns a data frame with one row per trading day, containing the open, high, low, and close prices, trading volume, and the adjusted closing price. For example:
stock_data <- tq_get(ticker, get = "stock.prices", from = "2024-01-01")
stock_data
# A tibble: 314 × 8
symbol date open high low close volume adjusted
<chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 MSFT 2024-01-02 374. 376. 367. 371. 25258600 367.
2 MSFT 2024-01-03 369. 373. 369. 371. 23083500 367.
3 MSFT 2024-01-04 371. 373. 367. 368. 20901500 364.
4 MSFT 2024-01-05 369. 372. 366. 368. 20987000 364.
5 MSFT 2024-01-08 369. 375. 369. 375. 23134000 371.
6 MSFT 2024-01-09 372. 376. 371. 376. 20830000 372.
7 MSFT 2024-01-10 376. 384. 376. 383. 25514200 379.
8 MSFT 2024-01-11 386 391. 380. 385. 27850800 381.
9 MSFT 2024-01-12 385. 389. 385. 388. 21645700 385.
10 MSFT 2024-01-16 394. 394. 388. 390. 27202300 387.
# ℹ 304 more rows
As you can see above, we can specify how far back we want the data to go. (You can also optionally set an end date by using to = …; if you don’t, it just defaults to the most recent day.)
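For example, to cap the range at a specific end date (a sketch; the object name and dates here are just examples):

```r
# daily prices for the first half of 2024 only
stock_data_h1 <- tq_get(ticker, get = "stock.prices",
                        from = "2024-01-01",
                        to   = "2024-06-30")
```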
Now here’s where it gets even more interesting and powerful… Let’s say instead of daily prices, you wish you could look at it on a monthly basis. Or annually. Or weekly.
Well you could write some custom code yourself aimed at pulling out just the records for the last day of the month – but you don’t have to come up with that: tidyquant has done it for you using its tq_transmute() function. (The function uses the powers of other financial packages such as xts, quantmod and TTR under the hood.)
Modify our data to be monthly instead, based on the last closing price of the month.
stocks_monthly <- stock_data %>%
group_by(symbol) %>%
tq_transmute(select = close,
mutate_fun = to.monthly,
indexAt = "lastof")
stocks_monthly
# A tibble: 16 × 3
# Groups: symbol [1]
symbol date close
<chr> <date> <dbl>
1 MSFT 2024-01-31 398.
2 MSFT 2024-02-29 414.
3 MSFT 2024-03-31 421.
4 MSFT 2024-04-30 389.
5 MSFT 2024-05-31 415.
6 MSFT 2024-06-30 447.
7 MSFT 2024-07-31 418.
8 MSFT 2024-08-31 417.
9 MSFT 2024-09-30 430.
10 MSFT 2024-10-31 406.
11 MSFT 2024-11-30 423.
12 MSFT 2024-12-31 422.
13 MSFT 2025-01-31 415.
14 MSFT 2025-02-28 397.
15 MSFT 2025-03-31 375.
16 MSFT 2025-04-30 382.
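The weekly version mentioned above follows the same pattern; only the mutate_fun changes (a sketch reusing the stock_data object from earlier):

```r
# weekly closes, indexed to the last day of each week
stocks_weekly <- stock_data %>%
  group_by(symbol) %>%
  tq_transmute(select     = close,
               mutate_fun = to.weekly,
               indexAt    = "lastof")
```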
Want to try annually instead? It’s just a matter of one small tweak. Check it out…
stock_data %>%
group_by(symbol) %>%
tq_transmute(select = close,
mutate_fun = to.yearly, #here's the change
indexAt = "lastof")# A tibble: 2 × 3
# Groups: symbol [1]
symbol date close
<chr> <date> <dbl>
1 MSFT 2024-12-31 422.
2 MSFT 2025-04-30 382.
Now of course, just a couple years of annual data isn’t very illuminating. But if we want the data to start earlier in time, it’s as simple as just asking R for it.
stock_data_from2000 <- tq_get(ticker, get = "stock.prices", from = "2000-01-01")
stock_data_from2000
# A tibble: 6,351 × 8
symbol date open high low close volume adjusted
<chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 MSFT 2000-01-03 58.7 59.3 56 58.3 53228400 35.8
2 MSFT 2000-01-04 56.8 58.6 56.1 56.3 54119000 34.6
3 MSFT 2000-01-05 55.6 58.2 54.7 56.9 64059600 34.9
4 MSFT 2000-01-06 56.1 56.9 54.2 55 54976600 33.8
5 MSFT 2000-01-07 54.3 56.1 53.7 55.7 62013600 34.2
6 MSFT 2000-01-10 56.7 56.8 55.7 56.1 44963600 34.5
7 MSFT 2000-01-11 55.8 57.1 54.3 54.7 46743600 33.6
8 MSFT 2000-01-12 54.2 54.4 52.2 52.9 66532400 32.5
9 MSFT 2000-01-13 52.2 54.3 50.8 53.9 83144000 33.1
10 MSFT 2000-01-14 53.6 57.0 52.9 56.1 73416400 34.5
# ℹ 6,341 more rows
stock_data_from2000 %>%
group_by(symbol) %>%
tq_transmute(select = close,
mutate_fun = to.yearly,
indexAt = "lastof")
# A tibble: 26 × 3
# Groups: symbol [1]
symbol date close
<chr> <date> <dbl>
1 MSFT 2000-12-31 21.7
2 MSFT 2001-12-31 33.1
3 MSFT 2002-12-31 25.9
4 MSFT 2003-12-31 27.4
5 MSFT 2004-12-31 26.7
6 MSFT 2005-12-31 26.1
7 MSFT 2006-12-31 29.9
8 MSFT 2007-12-31 35.6
9 MSFT 2008-12-31 19.4
10 MSFT 2009-12-31 30.5
# ℹ 16 more rows
Keep in mind, depending on the use case, and as you get more comfortable with this, you can combine some of these steps together…
tq_get(ticker, get = "stock.prices", from = "2000-01-01") %>%
group_by(symbol) %>%
tq_transmute(select = close,
mutate_fun = to.yearly,
indexAt = "lastof")
# A tibble: 26 × 3
# Groups: symbol [1]
symbol date close
<chr> <date> <dbl>
1 MSFT 2000-12-31 21.7
2 MSFT 2001-12-31 33.1
3 MSFT 2002-12-31 25.9
4 MSFT 2003-12-31 27.4
5 MSFT 2004-12-31 26.7
6 MSFT 2005-12-31 26.1
7 MSFT 2006-12-31 29.9
8 MSFT 2007-12-31 35.6
9 MSFT 2008-12-31 19.4
10 MSFT 2009-12-31 30.5
# ℹ 16 more rows
There are all kinds of other questions we can ponder, and then answer using relatively straightforward functions (all things considered).
Let’s say now that we have data going back to 2000, we’d like to also calculate what the annual return was for our company’s stock. We can do that like this:
stock_data_from2000 %>%
tq_transmute(select = close,
mutate_fun = periodReturn,
period = "yearly",
col_rename = "annual_return")# A tibble: 26 × 2
date annual_return
<date> <dbl>
1 2000-12-29 -0.628
2 2001-12-31 0.527
3 2002-12-31 -0.220
4 2003-12-31 0.0588
5 2004-12-31 -0.0237
6 2005-12-30 -0.0213
7 2006-12-29 0.142
8 2007-12-31 0.192
9 2008-12-31 -0.454
10 2009-12-31 0.568
# ℹ 16 more rows
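Under the hood, periodReturn with period = "yearly" computes each year's simple return: the last close of the year divided by the close at the start of the period, minus one. We can sanity-check the first row by hand, using the rounded closes shown in the tables above:

```r
# rounded values from the output above, used purely as a sanity check
first_close_2000 <- 58.3   # close on 2000-01-03, the first trading day in the data
last_close_2000  <- 21.7   # close at the end of 2000, per the yearly table

last_close_2000 / first_close_2000 - 1
# roughly -0.628, matching the first row of the periodReturn output
```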
Want to see monthly returns instead? It’s as simple as doing:
stock_data_from2000 %>%
tq_transmute(select = close,
mutate_fun = periodReturn,
period = "monthly", #here's the change
col_rename = "monthly_return")# A tibble: 304 × 2
date monthly_return
<date> <dbl>
1 2000-01-31 -0.160
2 2000-02-29 -0.0868
3 2000-03-31 0.189
4 2000-04-28 -0.344
5 2000-05-31 -0.103
6 2000-06-30 0.279
7 2000-07-31 -0.127
8 2000-08-31 0
9 2000-09-29 -0.136
10 2000-10-31 0.142
# ℹ 294 more rows
Now keep in mind, what we did above used the closing price of the stock. But we might want to take into account dividends, stock splits, etc., which can affect the stock’s value. If we want to adjust for these things to get a potentially more accurate picture of the stock’s returns over time, we can use the adjusted field in the data instead.
stock_data_from2000 %>%
tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "yearly",
col_rename = "annual_return")# A tibble: 26 × 2
date annual_return
<date> <dbl>
1 2000-12-29 -0.628
2 2001-12-31 0.527
3 2002-12-31 -0.220
4 2003-12-31 0.0682
5 2004-12-31 0.0913
6 2005-12-30 -0.00938
7 2006-12-29 0.158
8 2007-12-31 0.208
9 2008-12-31 -0.444
10 2009-12-31 0.605
# ℹ 16 more rows
Visualizing
Want to visualize the returns?
We can do that too, using the ggplot2 package, augmented by tidyquant.
First let’s make sure we’ve saved our annual return dataset as a new named object.
annualreturns_data <- stock_data_from2000 %>%
tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "yearly", #here's the change
col_rename = "annual_return")
annualreturns_data# A tibble: 26 × 2
date annual_return
<date> <dbl>
1 2000-12-29 -0.628
2 2001-12-31 0.527
3 2002-12-31 -0.220
4 2003-12-31 0.0682
5 2004-12-31 0.0913
6 2005-12-30 -0.00938
7 2006-12-29 0.158
8 2007-12-31 0.208
9 2008-12-31 -0.444
10 2009-12-31 0.605
# ℹ 16 more rows
Now we’ll create the chart.
annualreturns_data %>%
ggplot(aes(x = year(date), y = annual_return)) +
geom_col() +
labs(title = "Annual Returns",
subtitle = "2000 through the present",
y = "Returns", x = "", color = "") +
scale_y_continuous(labels = scales::percent) +
scale_x_reverse() + # this reverses the order of years
coord_flip() +
theme_tq()
We can spruce up the chart in any way we want - let’s say for example we wanted to show colors tied to positive or negative returns.
annualreturns_data %>%
ggplot(aes(x = year(date), y = annual_return, fill = annual_return > 0)) +
geom_col() +
scale_fill_manual(values = c("firebrick", "forestgreen")) +
labs(title = "Annual Returns",
subtitle = "2000 through the present",
y = "Returns", x = "", fill = "") +
scale_y_continuous(labels = scales::percent) +
scale_x_reverse() + # this reverses the order of years
coord_flip() +
theme_tq() +
theme(legend.position = "none")
What other kinds of visualizations can we do?
How about a line chart of the stock’s adjusted closing price over time? We can do this by using the geom_line() function of ggplot2. To simplify, we’ll use our original daily stock data, which starts in 2024. For example:
ggplot(stock_data, aes(x = date, y = adjusted)) +
geom_line()
If we wanted to add some labels…
stock_data %>%
ggplot(aes(x = date, y = adjusted)) +
geom_line() +
labs(title = "Stock Price", y = "Closing Price", x = "")Now once again, we have some very helpful financial measures built in to tidyquant. Let’s say we’d like to smooth things out here a bit, and calculate a moving average? The geom_ma() function delivers it for us.
# simple moving averages
stock_data %>%
ggplot(aes(x = date, y = close)) +
geom_line() +
labs(title = "Stock Price", y = "Closing Price", x = "") +
geom_ma(ma_fun = SMA, n = 50, color = "red", linewidth = 1.25) +
theme_minimal()
Want a 30-day moving average instead? Well, you just have to make one tiny tweak.
stock_data %>%
ggplot(aes(x = date, y = close)) +
geom_line() +
labs(title = "Stock Price", y = "Closing Price", x = "") +
geom_ma(ma_fun = SMA, n = 30, color = "red", linewidth = 1.25) + #can you spot the change in this line?
theme_minimal()
Maybe you’d actually like to have both at the same time? No problem. We’ll distinguish the colors and line types here to make it easier to see.
stock_data %>%
ggplot(aes(x = date, y = close)) +
geom_line() +
labs(title = "Stock Price", y = "Closing Price", x = "") +
geom_ma(ma_fun = SMA, n = 30, color = "blue", linewidth = 1.25, linetype = "dotted") +
geom_ma(ma_fun = SMA, n = 50, color = "red", linewidth = 1.25) +
theme_minimal()
And remember once again, like we did earlier above, we could choose to look at weekly, monthly, or annual prices instead of daily.
Also, note that above we’re using a Simple Moving Average (SMA) for all of our analysis. But tidyquant also supports a range of other calculations, including:
Exponential moving averages (EMA) Weighted moving averages (WMA) Double exponential moving averages (DEMA) Zero-lag exponential moving averages (ZLEMA) Volume-weighted moving averages (VWMA) Elastic volume-weighted moving averages (EVWMA)
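Swapping in one of these is the same one-argument change; for instance, an exponential moving average (a sketch mirroring the 50-day chart above):

```r
# same chart as before, but with an exponential moving average
stock_data %>%
  ggplot(aes(x = date, y = close)) +
  geom_line() +
  labs(title = "Stock Price", y = "Closing Price", x = "") +
  geom_ma(ma_fun = EMA, n = 50, color = "red", linewidth = 1.25) +
  theme_minimal()
```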
Multiple companies at once
You may be asking, could I grab data on more than one company, so I can compare them? Indeed.
mycompanies <- tq_get(c("AAPL", # Apple
"MSFT", # Microsoft
"NYT", # New York Times
"XOM"), # ExxonMobil
get = "stock.prices",
from = "2023-01-01")
mycompanies
# A tibble: 2,256 × 8
symbol date open high low close volume adjusted
<chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 AAPL 2023-01-03 130. 131. 124. 125. 112117500 124.
2 AAPL 2023-01-04 127. 129. 125. 126. 89113600 125.
3 AAPL 2023-01-05 127. 128. 125. 125. 80962700 124.
4 AAPL 2023-01-06 126. 130. 125. 130. 87754700 128.
5 AAPL 2023-01-09 130. 133. 130. 130. 70790800 129.
6 AAPL 2023-01-10 130. 131. 128. 131. 63896200 129.
7 AAPL 2023-01-11 131. 134. 130. 133. 69458900 132.
8 AAPL 2023-01-12 134. 134. 131. 133. 71379600 132.
9 AAPL 2023-01-13 132. 135. 132. 135. 57809700 133.
10 AAPL 2023-01-17 135. 137. 134. 136. 63646600 134.
# ℹ 2,246 more rows
mycompanies %>%
count(symbol)
# A tibble: 4 × 2
symbol n
<chr> <int>
1 AAPL 564
2 MSFT 564
3 NYT 564
4 XOM 564
Now we’ll chart those out to compare, using almost identical code as above, but with some changes to allow small-multiple charts using facet_wrap().
mycompanies %>%
ggplot(aes(x = date, y = close)) +
geom_line() +
labs(title = "", y = "Closing Price", x = "") +
facet_wrap(~ symbol, ncol = 2, scales = "free_y")
Want to add that moving average again? Can do that, too.
mycompanies %>%
ggplot(aes(x = date, y = close)) +
geom_line() +
labs(title = "", y = "Closing Price", x = "") +
geom_ma(ma_fun = SMA, n = 50, color = "red", size = 1.25) +
facet_wrap(~ symbol, ncol = 2, scale = "free_y") +
theme_minimal()Lowest or Highest Since X?
What if we wanted to say something like “The stock of Company X closed at its lowest price since X?”
How might we do that? We can use some additional R code from the tidyverse ecosystem of packages to perform some analysis that gives us the answer.
# using Tesla as an example...
# we'll choose a moment where its stock was particularly
# low, to show how this would work on a daily beat.
# We'll use the regular close for this example, but remember you may want to
# use the adjusted close, depending.
mydata <- tq_get("TSLA",
get = "stock.prices",
from = "2022-01-01",
to = "2025-03-11") %>%
arrange(desc(date))
mydata
# A tibble: 798 × 8
symbol date open high low close volume adjusted
<chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 TSLA 2025-03-10 253. 253. 220 222. 189076900 222.
2 TSLA 2025-03-07 259. 266. 251. 263. 102369600 263.
3 TSLA 2025-03-06 272. 273. 260. 263. 98451600 263.
4 TSLA 2025-03-05 273. 280. 268. 279. 94042900 279.
5 TSLA 2025-03-04 271. 284. 262. 272. 126706600 272.
6 TSLA 2025-03-03 300. 304. 277. 285. 115551400 285.
7 TSLA 2025-02-28 280. 294. 274. 293. 115697000 293.
8 TSLA 2025-02-27 291. 297. 281. 282. 101748200 282.
9 TSLA 2025-02-26 304. 309 288. 291. 100118300 291.
10 TSLA 2025-02-25 327. 329. 297. 303. 134228800 303.
# ℹ 788 more rows
# most recent
mydata %>%
slice_max(date, n = 1)
# A tibble: 1 × 8
symbol date open high low close volume adjusted
<chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 TSLA 2025-03-10 253. 253. 220 222. 189076900 222.
# store the most recent price value
mydata_current_price <- mydata %>%
slice_max(date, n = 1) %>%
pull(close)
mydata_current_price
[1] 222.15
# search for at least as low
mydata %>%
filter(close <= mydata_current_price) %>%
select(symbol, date, close)
# A tibble: 337 × 3
symbol date close
<chr> <date> <dbl>
1 TSLA 2025-03-10 222.
2 TSLA 2024-10-23 214.
3 TSLA 2024-10-22 218.
4 TSLA 2024-10-21 219.
5 TSLA 2024-10-18 221.
6 TSLA 2024-10-17 221.
7 TSLA 2024-10-16 221.
8 TSLA 2024-10-15 220.
9 TSLA 2024-10-14 219.
10 TSLA 2024-10-11 218.
# ℹ 327 more rows
# lowest closing prices for the company during the period captured?
mydata %>%
select(symbol, date, close) %>%
arrange(close)
# A tibble: 798 × 3
symbol date close
<chr> <date> <dbl>
1 TSLA 2023-01-03 108.
2 TSLA 2022-12-27 109.
3 TSLA 2023-01-05 110.
4 TSLA 2022-12-28 113.
5 TSLA 2023-01-06 113.
6 TSLA 2023-01-04 114.
7 TSLA 2023-01-10 119.
8 TSLA 2023-01-09 120.
9 TSLA 2022-12-29 122.
10 TSLA 2023-01-13 122.
# ℹ 788 more rows
# highest
mydata %>%
select(symbol, date, close) %>%
arrange(desc(close))
# A tibble: 798 × 3
symbol date close
<chr> <date> <dbl>
1 TSLA 2024-12-17 480.
2 TSLA 2024-12-16 463.
3 TSLA 2024-12-24 462.
4 TSLA 2024-12-26 454.
5 TSLA 2024-12-18 440.
6 TSLA 2024-12-13 436.
7 TSLA 2024-12-19 436.
8 TSLA 2024-12-27 432.
9 TSLA 2024-12-23 431.
10 TSLA 2025-01-15 428.
# ℹ 788 more rows
FRED Economic Data
A wealth of economic data can be extracted from the Federal Reserve Economic Data (FRED) database, which contains thousands of data sets that are free to use. Browse the FRED categories to narrow down the database and find data codes: https://fred.stlouisfed.org/categories
Let’s talk about them and FRED’s vast richness of data for business reporting.
In addition to the work we’re doing here in R, for example, there is an Excel plug-in for FRED data you may find useful as well. There’s even a FRED mobile app.
National Examples
US Regular All Formulations Gas Price (GASREGW), weekly
gasprices <- tq_get("GASREGW", get = "economic.data", from = "2023-01-01")
gasprices
# A tibble: 118 × 3
symbol date price
<chr> <date> <dbl>
1 GASREGW 2023-01-02 3.22
2 GASREGW 2023-01-09 3.26
3 GASREGW 2023-01-16 3.31
4 GASREGW 2023-01-23 3.42
5 GASREGW 2023-01-30 3.49
6 GASREGW 2023-02-06 3.44
7 GASREGW 2023-02-13 3.39
8 GASREGW 2023-02-20 3.38
9 GASREGW 2023-02-27 3.34
10 GASREGW 2023-03-06 3.39
# ℹ 108 more rows
gasprices %>%
ggplot(aes(x = date, y = price)) +
geom_line(color = "darkred") +
theme_minimal()
30-Year Mortgage Rate Average (MORTGAGE30US), weekly
mortgate_30yr_weekly <- tq_get("MORTGAGE30US", get = "economic.data", from = "2023-01-01")
mortgate_30yr_weekly
# A tibble: 118 × 3
symbol date price
<chr> <date> <dbl>
1 MORTGAGE30US 2023-01-05 6.48
2 MORTGAGE30US 2023-01-12 6.33
3 MORTGAGE30US 2023-01-19 6.15
4 MORTGAGE30US 2023-01-26 6.13
5 MORTGAGE30US 2023-02-02 6.09
6 MORTGAGE30US 2023-02-09 6.12
7 MORTGAGE30US 2023-02-16 6.32
8 MORTGAGE30US 2023-02-23 6.5
9 MORTGAGE30US 2023-03-02 6.65
10 MORTGAGE30US 2023-03-09 6.73
# ℹ 108 more rows
mortgate_30yr_weekly %>%
ggplot(aes(x = date, y = price)) +
geom_line(color = "darkred") +
theme_minimal()
Consumer Price Index for All Urban Consumers (CPIAUCSL), monthly
cpi <- tq_get("CPIAUCSL", get = "economic.data", from = "2018-01-01")
cpi
# A tibble: 86 × 3
symbol date price
<chr> <date> <dbl>
1 CPIAUCSL 2018-01-01 249.
2 CPIAUCSL 2018-02-01 250.
3 CPIAUCSL 2018-03-01 250.
4 CPIAUCSL 2018-04-01 250.
5 CPIAUCSL 2018-05-01 251.
6 CPIAUCSL 2018-06-01 251.
7 CPIAUCSL 2018-07-01 251.
8 CPIAUCSL 2018-08-01 252.
9 CPIAUCSL 2018-09-01 252.
10 CPIAUCSL 2018-10-01 253.
# ℹ 76 more rows
cpi %>%
ggplot(aes(x = date, y = price)) +
geom_line(color = "darkred") +
theme_minimal()
Unemployment Rate, civilian (UNRATE), monthly
tq_get("UNRATE", get = "economic.data", from = "2018-01-01") %>%
ggplot(aes(x = date, y = price)) +
geom_line(color = "darkred") +
theme_minimal()
TKTKTK TKTKTKTK
Localized examples
FRED also compiles numerous measures below the national level - you can find many for the local community or region in which you live:
https://fred.stlouisfed.org/categories/3008
For example, Per Capita Personal Income data (annual) is available at the county level in many states. Let’s look at Arlington County, Virginia, where we’re sitting right now.
arlington_pcpi <- tq_get("PCPI51013", get = "economic.data", from = "2010-01-01")
arlington_pcpi
# A tibble: 14 × 3
symbol date price
<chr> <date> <int>
1 PCPI51013 2010-01-01 79216
2 PCPI51013 2011-01-01 82336
3 PCPI51013 2012-01-01 83499
4 PCPI51013 2013-01-01 79030
5 PCPI51013 2014-01-01 82623
6 PCPI51013 2015-01-01 85196
7 PCPI51013 2016-01-01 86290
8 PCPI51013 2017-01-01 89695
9 PCPI51013 2018-01-01 94324
10 PCPI51013 2019-01-01 96204
11 PCPI51013 2020-01-01 98411
12 PCPI51013 2021-01-01 109431
13 PCPI51013 2022-01-01 117273
14 PCPI51013 2023-01-01 124345
Keep in mind, for many such localized measures, the FRED website makes it easy to see a nationwide map of all counties (or the relevant geographies) tied to a measure, by selecting the “View Map” button. Let’s take a look.
Sticking with Arlington for a moment, let’s now examine some other measures.
How about the 5-year homeownership rate estimate?
arlington_5yhomeownership <- tq_get("HOWNRATEACS051013", get = "economic.data", from = "2010-01-01")
arlington_5yhomeownership
# A tibble: 14 × 3
symbol date price
<chr> <date> <dbl>
1 HOWNRATEACS051013 2010-01-01 51.2
2 HOWNRATEACS051013 2011-01-01 50.6
3 HOWNRATEACS051013 2012-01-01 49.4
4 HOWNRATEACS051013 2013-01-01 48.7
5 HOWNRATEACS051013 2014-01-01 48.4
6 HOWNRATEACS051013 2015-01-01 48.0
7 HOWNRATEACS051013 2016-01-01 48.4
8 HOWNRATEACS051013 2017-01-01 48.4
9 HOWNRATEACS051013 2018-01-01 47.8
10 HOWNRATEACS051013 2019-01-01 47.4
11 HOWNRATEACS051013 2020-01-01 47.8
12 HOWNRATEACS051013 2021-01-01 47.6
13 HOWNRATEACS051013 2022-01-01 47.8
14 HOWNRATEACS051013 2023-01-01 48.0
How about the county’s Subprime Credit Population, by quarter? This data only begins in 2014, but…pretty interesting!
arlington_subprimepop <- tq_get("EQFXSUBPRIME051013", get = "economic.data", from = "2014-01-01")
arlington_subprimepop
# A tibble: 43 × 3
symbol date price
<chr> <date> <dbl>
1 EQFXSUBPRIME051013 2014-04-01 14.7
2 EQFXSUBPRIME051013 2014-07-01 14.8
3 EQFXSUBPRIME051013 2014-10-01 15.3
4 EQFXSUBPRIME051013 2015-01-01 14.6
5 EQFXSUBPRIME051013 2015-04-01 14.4
6 EQFXSUBPRIME051013 2015-07-01 14.3
7 EQFXSUBPRIME051013 2015-10-01 14.1
8 EQFXSUBPRIME051013 2016-01-01 14.2
9 EQFXSUBPRIME051013 2016-04-01 13.9
10 EQFXSUBPRIME051013 2016-07-01 13.6
# ℹ 33 more rows
arlington_subprimepop %>%
ggplot(aes(x = date, y = price)) +
geom_line(color = "darkred") +
theme_minimal()
How about right across the river in Washington, D.C. – if you happen to be curious how many businesses were formed on a monthly basis, you could pull business application data.
dc_bizapplications <- tq_get("BABATOTALSADC", get = "economic.data", from = "2014-01-01")
dc_bizapplications
# A tibble: 134 × 3
symbol date price
<chr> <date> <int>
1 BABATOTALSADC 2014-01-01 786
2 BABATOTALSADC 2014-02-01 873
3 BABATOTALSADC 2014-03-01 768
4 BABATOTALSADC 2014-04-01 817
5 BABATOTALSADC 2014-05-01 763
6 BABATOTALSADC 2014-06-01 811
7 BABATOTALSADC 2014-07-01 736
8 BABATOTALSADC 2014-08-01 735
9 BABATOTALSADC 2014-09-01 765
10 BABATOTALSADC 2014-10-01 733
# ℹ 124 more rows
dc_bizapplications %>%
ggplot(aes(x = date, y = price)) +
geom_line(color = "darkred") +
theme_minimal()
While we’re at it, how are home prices looking around the whole DC metro area? Let’s examine the Case-Shiller index…
tq_get("WDXRSA", get = "economic.data", from = "2014-01-01") %>%
ggplot(aes(x = date, y = price)) +
geom_line(color = "darkred") +
theme_minimal()
As you can see, there are almost infinite ways to use FRED data, both to capture the local picture and the national one, and to search for information that can help fuel insightful reporting.
Getting data out of R
You may be wondering, ok this is pretty cool how R is able to deal with this data, but I might be one of the few people in my newsroom (or the only one!) who learns to use R.
Once I have what I want, how can I get things out of R if I want to share it with colleagues who only know spreadsheets?
Not to fear: you can export dataframes from R into lots of formats, including Excel.
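If a colleague just needs a flat text file, a plain CSV export is a one-liner with readr (which loads with the tidyverse; the filename here is just an example):

```r
# export the same data frame to a CSV file
readr::write_csv(stock_data, "my_exported_file.csv")
```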
Let’s do that now using the writexl package.
We’ll go back to our original daily stock price dataset, which we compiled earlier:
head(stock_data)
# A tibble: 6 × 8
symbol date open high low close volume adjusted
<chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 MSFT 2024-01-02 374. 376. 367. 371. 25258600 367.
2 MSFT 2024-01-03 369. 373. 369. 371. 23083500 367.
3 MSFT 2024-01-04 371. 373. 367. 368. 20901500 364.
4 MSFT 2024-01-05 369. 372. 366. 368. 20987000 364.
5 MSFT 2024-01-08 369. 375. 369. 375. 23134000 371.
6 MSFT 2024-01-09 372. 376. 371. 376. 20830000 372.
Here’s how we can export to Excel:
writexl::write_xlsx(stock_data, "my_exported_file.xlsx")
How about Google Sheets? Enter the googlesheets4 package.
Now let’s try something even neater: could we have a shared Google Sheet that we update with new records as they become available? One that your non-tech-savvy colleagues could use and you could automatically update as needed?
# sheet is here:
# https://docs.google.com/spreadsheets/d/1U0LVjDZUS_kV_GQxcSKU6Gebrl6jprUmxdkdfq9CYsc/edit?usp=sharing
# to authorize via web the first time
# gs4_auth()
# to automatically select a previously authorized token based on google account email
# stored locally in .Renviron
gs4_auth(Sys.getenv("GOOGLE_ACCOUNT_EMAIL"))
# import shared google sheet
# specify id
target_sheet_id <- "1fxFzpPC76KtgKkHtK0oXUuUYfo9EKdFUbFmgmAKq0eM"
# read in the sheet based on that id
live_gsheet <- read_sheet(target_sheet_id, sheet = "dailystockprices")
✔ Reading from "Sabew 2025 Demo File - Stock Data".
✔ Range ''dailystockprices''.
head(live_gsheet)
# A tibble: 6 × 8
symbol date open high low close volume adjusted
<chr> <dttm> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 MSFT 2024-01-02 00:00:00 374. 376. 367. 371. 25258600 367.
2 MSFT 2024-01-03 00:00:00 369. 373. 369. 371. 23083500 367.
3 MSFT 2024-01-04 00:00:00 371. 373. 367. 368. 20901500 364.
4 MSFT 2024-01-05 00:00:00 369. 372. 366. 368. 20987000 364.
5 MSFT 2024-01-08 00:00:00 369. 375. 369. 375. 23134000 371.
6 MSFT 2024-01-09 00:00:00 372. 376. 371. 376. 20830000 372.
# compare latest dataset with gsheet to identify newest records that aren't yet in gsheet
# anti join to spot new ones based on the date column
new_records_toadd <- anti_join(stock_data, live_gsheet, by = "date")
new_records_toadd
# A tibble: 0 × 8
# ℹ 8 variables: symbol <chr>, date <date>, open <dbl>, high <dbl>, low <dbl>,
# close <dbl>, volume <dbl>, adjusted <dbl>
# round the values to help match what's in the google sheet
# (round_half_up() comes from the janitor package)
new_records_toadd <- new_records_toadd %>%
mutate(across(c(open, high, low, close, adjusted),
~round_half_up(., digits = 2)))
# now we'll append new records to live gsheet
sheet_append(target_sheet_id, new_records_toadd)
✔ Writing to "Sabew 2025 Demo File - Stock Data".
✔ Appending 0 rows to 'dailystockprices'.
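If you automate this step, you may want to skip the write entirely when there’s nothing new; here’s a minimal guard (a sketch reusing the objects created above):

```r
# only write to the sheet when there are actually new rows to add
if (nrow(new_records_toadd) > 0) {
  sheet_append(target_sheet_id, new_records_toadd)
} else {
  message("Google Sheet is already up to date - nothing to append.")
}
```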